
1,058 research outputs found

Anytime Point-Based Approximations for Large POMDPs

Author: Gordon G.
Pineau J.
Thrun S.
Publication venue: 'AI Access Foundation'
Publication date: 04/10/2011

The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points which work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis justifying the choice of belief selection technique. The second aim of this paper is to provide a thorough empirical comparison between PBVI and other state-of-the-art POMDP methods, in particular the Perseus algorithm, in an effort to highlight their similarities and differences. Evaluation is performed using both standard POMDP domains and realistic robotic tasks.

arXiv.org e-Print Archive

Crossref

Finding Approximate POMDP solutions Through Belief Compression

Author: Gordon G.
Roy N.
Thrun S.
Publication venue: 'AI Access Foundation'
Publication date: 04/10/2011

Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are generally considered to be intractable for large models. The intractability of these algorithms is to a large extent a consequence of computing an exact, optimal policy over the entire belief space. However, in real-world POMDP problems, computing the optimal policy for the full belief space is often unnecessary for good control even for problems with complicated policy classes. The beliefs experienced by the controller often lie near a structured, low-dimensional subspace embedded in the high-dimensional belief space. Finding a good approximation to the optimal value function for only this subspace can be much easier than computing the full value function. We introduce a new method for solving large-scale POMDPs by reducing the dimensionality of the belief space. We use Exponential family Principal Components Analysis (Collins, Dasgupta and Schapire, 2002) to represent sparse, high-dimensional belief spaces using small sets of learned features of the belief state. We then plan only in terms of the low-dimensional belief features. By planning in this low-dimensional space, we can find policies for POMDP models that are orders of magnitude larger than models that can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and on mobile robot navigation tasks

arXiv.org e-Print Archive

Crossref

Memory Aware Synapses: Learning what (not) to forget

Author: D Hebb
Ferenc Huszár
M McCloskey
MB Ring
O Russakovsky
RM French
S Thrun
Z Li
Publication date: 05/10/2018

Humans can learn in a continuous manner. Old rarely utilized knowledge can be overwritten by new incoming information while important, frequently used knowledge is prevented from being erased. In artificial learning systems, lifelong learning so far has focused mainly on accumulating knowledge over tasks and overcoming catastrophic forgetting. In this paper, we argue that, given the limited model capacity and the unlimited new information to be learned, knowledge has to be preserved or erased selectively. Inspired by neuroplasticity, we propose a novel approach for lifelong learning, coined Memory Aware Synapses (MAS). It computes the importance of the parameters of a neural network in an unsupervised and online manner. Given a new sample which is fed to the network, MAS accumulates an importance measure for each parameter of the network, based on how sensitive the predicted output function is to a change in this parameter. When learning a new task, changes to important parameters can then be penalized, effectively preventing important knowledge related to previous tasks from being overwritten. Further, we show an interesting connection between a local version of our method and Hebb's rule, which is a model for the learning process in the brain. We test our method on a sequence of object recognition tasks and on the challenging problem of learning an embedding for predicting triplets. We show state-of-the-art performance and, for the first time, the ability to adapt the importance of the parameters based on unlabeled data towards what the network needs (not) to forget, which may vary depending on test conditions. Comment: ECCV 201

arXiv.org e-Print Archive

Crossref

Evaluation of laser range-finder mapping for agricultural spraying vehicles

Author: A Nüchter
AM Endalew
F Rovira-Más
F-A Moreno
H Jeon
J Anthonis
J Wei
M Magnusson
P Walklate
S Pedersen
S Thrun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/08/2013

In this paper, we present a new application of laser range-finder sensing to agricultural spraying vehicles. The current generation of spraying vehicles uses automatic controllers to maintain the height of the sprayer booms above the crop. However, these control systems are typically based on ultrasonic sensors mounted on the booms, which limits the accuracy of the measurements and the response of the controller to changes in the terrain, resulting in a sub-optimal spraying process. To overcome these limitations, we propose to use a laser scanner, attached to the front of the sprayer's cabin, to scan the ground surface in front of the vehicle and to build a scrolling 3D map of the terrain. We evaluate the proposed solution in a series of field tests, demonstrating that the approach provides a more detailed and accurate representation of the environment than the current sonar-based solution and can lead to the development of more efficient boom control systems.

University of Lincoln Institutional Repository

Crossref

Appearance-based localization for mobile robots using digital zoom and visual compass

Author: Ballard
Bovik
Burgard
Dempster
Duckett
E. Fletcher
Jensfelt
K. Burn
Lowe
Moravec
N. Bellotto
Oore
S. Wermter
Thrun
Zuidervel
Publication venue: 'Elsevier BV'
Publication date: 18/07/2007

This paper describes a localization system for mobile robots moving in dynamic indoor environments, which uses probabilistic integration of visual appearance and odometry information. The approach is based on a novel image matching algorithm for appearance-based place recognition that integrates digital zooming, to extend the area of application, and a visual compass. Ambiguous information used for recognizing places is resolved with multiple hypothesis tracking and a selection procedure inspired by Markov localization. This enables the system to deal with perceptual aliasing or absence of reliable sensor data. It has been implemented on a robot operating in an office scenario and the robustness of the approach demonstrated experimentally

University of Lincoln Institutional Repository

Crossref

Sunderland University Institutional Repository

Temporal Correlations and Persistence in the Kinetic Ising Model: the Role of Temperature

Author: Barraquand J.
Kwon W H
Lawler E.L.
Lozano-Perez T.
Mayne D Q
Schwartz J. T.
Soueres P.
Thrun S.
Udupa S.
Publication date: 06/12/2000

We study the statistical properties of the sum $S_t=\int_{0}^{t}dt' \sigma_{t'}$, that is the difference of time spent positive or negative by the spin $\sigma_{t}$, located at a given site of a $D$-dimensional Ising model evolving under Glauber dynamics from a random initial configuration. We investigate the distribution of $S_{t}$ and the first-passage statistics (persistence) of this quantity. We discuss successively the three regimes of high temperature ($T>T_{c}$), criticality ($T=T_c$), and low temperature ($T<T_{c}$). We discuss in particular the question of the temperature dependence of the persistence exponent $\theta$, as well as that of the spectrum of exponents $\theta(x)$, in the low temperature phase. The probability that the temporal mean $S_t/t$ was always larger than the equilibrium magnetization is found to decay as $t^{-\theta-\frac12}$. This yields a numerical determination of the persistence exponent $\theta$ in the whole low temperature phase, in two dimensions, and above the roughening transition, in the low-temperature phase of the three-dimensional Ising model. Comment: 21 pages, 11 PostScript figures included (1 color figure

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Deep Blue Documents at the University of Michigan

Adding New Tasks to a Single Network with Weight Transformations using Binary Masks

Author: A Mallya
BM Lake
I Kuzborskij
J Kirkpatrick
J Stallkamp
M McCloskey
M Ristin
Mathias Eitz
O Russakovsky
RM French
S Munder
S Thrun
T Mensink
Z Li
Publication date: 14/06/2018

Visual recognition algorithms are required today to exhibit adaptive abilities. Given a deep model trained on a specific, given task, it would be highly desirable to be able to adapt incrementally to new tasks, preserving scalability as the number of new tasks increases, while at the same time avoiding catastrophic forgetting issues. Recent work has shown that masking the internal weights of a given original conv-net through learned binary variables is a promising strategy. We build upon this intuition and take into account more elaborate affine transformations of the convolutional weights that include learned binary masks. We show that with our generalization it is possible to achieve significantly higher levels of adaptation to new tasks, enabling the approach to compete with fine tuning strategies by requiring slightly more than 1 bit per network parameter per additional task. Experiments on two popular benchmarks showcase the power of our approach, which achieves the new state of the art on the Visual Decathlon Challenge.

arXiv.org e-Print Archive

Crossref

Archivio della ricerca - Fondazione Bruno Kessler

Archivio della ricerca- Università di Roma La Sapienza

Meta-Tracker: Fast and Robust Online Adaptation for Visual Object Trackers

Author: B Babenko
D Held
H Grabner
J Schmidhuber
JF Henriques
K Zhang
L Bertinetto
M Kristan
O Russakovsky
S Hare
S Hochreiter
S Thrun
Y Wu
Y-X Wang
Z Kalal
Publication date: 19/03/2018

This paper improves state-of-the-art visual object trackers that use online adaptation. Our core contribution is an offline meta-learning-based method to adjust the initial deep networks used in online adaptation-based tracking. The meta learning is driven by the goal of deep networks that can quickly be adapted to robustly model a particular target in future frames. Ideally the resulting models focus on features that are useful for future frames, and avoid overfitting to background clutter, small parts of the target, or noise. By enforcing a small number of update iterations during meta-learning, the resulting networks train significantly faster. We demonstrate this approach on top of the high performance tracking approaches: tracking-by-detection based MDNet and the correlation based CREST. Experimental results on standard benchmarks, OTB2015 and VOT2016, show that our meta-learned versions of both trackers improve speed, accuracy, and robustness. Comment: Code: https://github.com/silverbottlep/meta_tracker

arXiv.org e-Print Archive

Crossref

Experimental analysis of sample-based maps for long-term SLAM

Author: Anguelov D.
Biber P.
Duda R.O.
Grossberg S.
Peter Biber
Stachniss C.
Sutton R.S.
Thrun S.
Tom Duckett
Wang C.-W.
Yamauchi B.
Zimmer U.
Publication venue: 'SAGE Publications'
Publication date: 01/01/2008

This paper presents a system for long-term SLAM (simultaneous localization and mapping) by mobile service robots and its experimental evaluation in a real dynamic environment. To deal with the stability-plasticity dilemma (the trade-off between adaptation to new patterns and preservation of old patterns), the environment is represented at multiple timescales simultaneously (5 in our experiments). A sample-based representation is proposed, where older memories fade at different rates depending on the timescale, and robust statistics are used to interpret the samples. The dynamics of this representation are analysed in a five week experiment, measuring the relative influence of short- and long-term memories over time, and further demonstrating the robustness of the approach

University of Lincoln Institutional Repository

CiteSeerX

Crossref

Mo Músaem Fíorúil: a web-based search and information service for museum visitors

Author: C. Schmid
D. Ballard
D. Lowe
K. Mikolajczyk
S. Thrun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Abstract. We describe the prototype of an interactive, web-based, museum artifact search and information service. Mo Músaem Fíorúil clusters and indexes images of museum artifacts taken by visitors to the museum, where the images are captured using a passive capture device such as Microsoft's SenseCam [1]. The system also matches clustered artifacts to images of the same artifact from the museum's official photo collection and allows the user to view images of the same artifact taken by other visitors to the museum. This matching process potentially allows the system to provide more detailed information about a particular artifact to the user based on their inferred preferences, thereby greatly enhancing the user's overall museum experience. In this work, we introduce the system and describe, in broad terms, its overall functionality and use. Using different image sets of artificial museum objects, we also describe experiments and results carried out in relation to the artifact matching component of the system.

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service